Autonomous measurement of speech intelligibility utilizing automatic speech recognition

Authors

  • Bernd T. Meyer
  • Birger Kollmeier
  • Jasper Ooster
Abstract

Measures of speech intelligibility are an essential tool for diagnosing hearing impairment and for tuning hearing aid parameters. This study explores the potential of automatic speech recognition (ASR) for conducting autonomous listening tests. In these tests (e.g., the Oldenburg sentence matrix test employed here), the responses of participants are usually logged by a human supervisor. The target value is the speech reception threshold (SRT), i.e., the signal-to-noise ratio at which 50% speech intelligibility is achieved. We explore what ASR error rates can be obtained for such responses, and how ASR errors affect the measured SRT value. To this end, a speech database was recorded that contains utterances from 20 speakers and covers different levels of language complexity, ranging from simple five-word sentences to utterances as produced in typical human-human interactions during testing. While the achievable SRT accuracy was not satisfactory for the most complex speech material, the ASR error rate for sentences without out-of-vocabulary words was below 1.3% and hence sufficient to obtain a test-retest reliability of 0.5 dB, which is identical to the reliability in human-supervised tests.
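The two quantities in the abstract can be illustrated with a minimal Python sketch. It assumes that the ASR error rate is a word error rate computed by edit distance, and that intelligibility follows a logistic psychometric function whose 50% point is the SRT; the abstract states neither detail. The function names, the example sentences, and the numeric values are hypothetical placeholders, not data from the study.

```python
import math


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def intelligibility(snr_db: float, srt_db: float, slope_per_db: float) -> float:
    """Logistic psychometric function: returns 0.5 when snr_db equals srt_db.

    slope_per_db is the slope of the curve at the SRT (assumed model form).
    """
    return 1.0 / (1.0 + math.exp(-4.0 * slope_per_db * (snr_db - srt_db)))


# Hypothetical five-word response with one misrecognized word.
spoken = "peter buys three wet cars"
recognized = "peter buys three red cars"
print(word_error_rate(spoken, recognized))

# Hypothetical SRT and slope; at the SRT the predicted intelligibility is 50%.
print(intelligibility(snr_db=-7.0, srt_db=-7.0, slope_per_db=0.17))
```

A single misrecognized word in a five-word sentence changes the logged score for that trial by 20 percentage points, which suggests why the ASR error rate matters for the measured SRT.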

Similar resources

A binaural microscopic model based on the modulation filter bank for predicting speech intelligibility in normal-hearing listeners

In this study, a binaural microscopic model for the prediction of speech intelligibility based on the modulation filter bank is introduced. So far, the spectral criteria such as the STI and SII or other analytical methods have been used in the binaural models to determine the binaural intelligibility. In the proposed model, unlike all models of binaural intelligibility prediction, an automatic ...

Improving automatic speech recognition performance and speech intelligibility with harmonicity based dereverberation

A speech signal captured by a distant microphone is generally smeared by reverberation, which severely degrades both speech intelligibility and Automatic Speech Recognition (ASR) performance. Previously, we proposed a novel dereverberation method, named “Harmonicity based dEReverBeration (HERB)”, which estimates the inverse filter of an unknown impulse response by utilizing the inherent spee...

Predicting Automatic Speech Recognition Performance Over Communication Channels from Instrumental Speech Quality and Intelligibility Scores

The performance of automatic speech recognition based on coded-decoded speech heavily depends on the quality of the transmitted signals, which is determined by channel impairments. This paper examines relationships between speech recognition performance and measurements of speech quality and intelligibility over transmission channels. In contrast to previous studies, the effects of super-wideband transmis...

A Database for Automatic Persian Speech Emotion Recognition: Collection, Processing and Evaluation

Recent developments in robotics automation have motivated researchers to improve the efficiency of interactive systems by making man-machine interaction more natural. Since speech is the most popular method of communication, recognizing human emotions from the speech signal has become a challenging research topic known as Speech Emotion Recognition (SER). In this study, we propose a Persian em...

Publication date: 2015